Generalized q-Cramer-Rao inequalities: the multidimensional case



1. Introduction 

It is well known that the Gaussian distribution is a central distribution with respect to classical 
information measures and inequalities. In particular, the Gaussian distribution is both a maximum 
entropy and a minimum Fisher information distribution over all distributions with the same 
variance. We will show that the same kind of result holds for the family of generalized q-Gaussians,
for Renyi or Tsallis entropy and a suitable extension of the Fisher information.

These generalized q-Gaussians appear in statistical physics, where they are the maximum
entropy distributions of the nonextensive thermostatistics [1]. The generalized q-Gaussian
distributions define a versatile family that can describe problems with compact support as well
as problems with heavy-tailed distributions. They are also analytical solutions of actual physical
problems, see [2], [3], [4], [5], and are sometimes known as Barenblatt-Pattle functions, following
their identification by [5], [7]. We shall also mention that the generalized q-Gaussian distributions
appear in other fields, namely as the solution of non-linear diffusion equations, or as the distributions
that saturate some sharp inequalities in functional analysis [8], [9], [10].

In the literature, and in particular within nonextensive thermostatistics, several extensions of
the Fisher information and of the Cramer-Rao inequality have been proposed, e.g. [12], [13], [14],
[15], [16]. In information theory, the remarkable work by Lutwak et al. [17], [18] also defines an
extended Fisher information and a Cramer-Rao inequality saturated by q-Gaussian distributions.
However the Fisher information is originally defined in a broader context as the information about 
a parameter of a parametric family of distributions. It is only in the special case of a location 
parameter that it reduces to the Fisher information of the distribution. The Fisher information is 
especially important for the formulation of the Cramer-Rao inequality. This well-known inequality 
appears in the context of estimation theory, where it defines a lower bound on the variance of any 
estimator of a parameter. 

In our recent work [19], we have built a bridge between concepts in estimation theory
and tools of nonextensive thermostatistics. Using the notion of escort distribution, we have
established an extended version of the Cramer-Rao inequality for the estimation of a parameter.
This new Cramer-Rao inequality includes the standard one, as well as Barankin-Vajda versions
[20, Corollary 5.1], [21] as particular cases. Furthermore, in the case of a location parameter, we
have obtained extended versions of the standard Cramer-Rao inequality, which are saturated by
the generalized q-Gaussians. This means that among all distributions with a given moment, the
generalized q-Gaussians are also the minimizers of extended versions of the Fisher information,
just as the standard Gaussian minimizes Fisher information over all distributions with a given 
variance. This result yields a new information-theoretic characterization of these generalized 
Gaussian distributions. 

However, these results were limited to the univariate case,
while the multidimensional case is obviously of high importance. This restriction is overcome in
the present paper, where we show that the previous results can be extended to the multidimensional
case. More than that, we consider an even wider context where moments of the error are computed 
with respect to two different probability distributions. In addition, giving our results for general 
norms will be hardly more difficult than for Euclidean norms, so we consider this general case from 
the beginning. Let us now give a brief overview of the results, together with the organization of 
the paper. 

Let $\theta \in \Theta \subseteq \mathbb{R}^n$ be a multidimensional parameter that we wish to estimate using data $x$. We
show that for $\hat\theta(x)$ an estimator of $\theta$, if $f(x;\theta)$ and $g(x;\theta)$ are two probability densities, and if $\alpha$
and $\beta$ are Holder conjugates of each other, then

$$ E_g\left[ \|\hat\theta(x) - \theta\|^{\alpha} \right]^{1/\alpha} I_{\beta}[f|g;\theta]^{1/\beta} \ge \left| n + \nabla_\theta . B_f(\theta) \right|, \tag{1} $$

where $\|.\|$ is a general norm on $\mathbb{R}^n$, $\nabla_\theta . B_f(\theta)$ represents the divergence of the bias between $\hat\theta(x)$
and $\theta$, and $I_\beta[f|g;\theta]$ stands for a generalized Fisher information that measures the information in
$f$ about $\theta$, and is taken with respect to $g$. This general result is established in section 3. Then we
discuss in subsection 3.2 some special cases of this general inequality. In particular, if $f$ and $g$ are a
pair of q-escort distributions, we obtain



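$$ E\left[ \|\hat\theta(x) - \theta\|^{\alpha} \right]^{1/\alpha} I_{\beta,q}[f|g;\theta]^{1/\beta} \ge \left| n + \nabla_\theta . E_q\left[ \hat\theta(x) - \theta \right] \right|, \tag{2} $$

$$ E_{\bar q}\left[ \|\hat\theta(x) - \theta\|^{\alpha} \right]^{1/\alpha} I_{\beta,q}[f|g;\theta]^{1/\beta} \ge \left| n + \nabla_\theta . E\left[ \hat\theta(x) - \theta \right] \right|, \tag{3} $$

where $I_{\beta,q}[f|g;\theta]$ is the generalized $(\beta,q)$-Fisher information, $\bar q = 1/q$, and $E_q[.]$ denotes the q-expectation
which is used in nonextensive statistics. These results are the multidimensional extensions, with an
arbitrary norm, of our previous q-Cramer-Rao inequalities [19]. In the monodimensional case and
$q = 1$, these inequalities reduce to the Barankin-Vajda Cramer-Rao inequality, and to the standard
Cramer-Rao inequality for $\alpha = \beta = 2$. In addition, in the case of a location parameter, we show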
that 



$$ E\left[ \|x\|^{\alpha} \right]^{1/\alpha} I_{\beta,q}[g]^{1/\beta} \ge n, \tag{4} $$



which reduces again to our previous results in the univariate case. Examining carefully the cases of
equality in (4), we show that the lower bound is attained by generalized q-Gaussian distributions,
and we prove that these generalized Gaussians are the unique extremal functions, provided that the
dual norm is strictly convex. For a random vector $x$ in $\mathbb{R}^n$, these generalized q-Gaussians have the
probability density

$$ G_\gamma(x) = \begin{cases} \dfrac{1}{Z(\gamma)} \left( 1 - (q-1)\gamma \|x\|^{\alpha} \right)_+^{\frac{1}{q-1}} & \text{for } q \ne 1, \\ \dfrac{1}{Z(\gamma)} \exp\left( -\gamma \|x\|^{\alpha} \right) & \text{if } q = 1, \end{cases} \tag{5} $$

for $\alpha \in (0, \infty)$, $\gamma$ a real positive parameter and $q > (n - \alpha)/n$, where we use the notation
$(x)_+ = \max\{x, 0\}$, and where $Z(\gamma)$ is the partition function§. For $q > 1$, the density has a
compact support, while for $q \le 1$ it is defined on the whole $\mathbb{R}^n$ and behaves as a power distribution
for $\|x\| \to \infty$. A shorthand notation for the expression of the generalized q-Gaussian density is

$$ G_\gamma(x) = \frac{1}{Z(\gamma)} \exp_{q_*}\left( -\gamma \|x\|^{\alpha} \right), \tag{6} $$

with $q_* = 2 - q$, and where the so-called q-exponential function is defined by

$$ \exp_q(x) := \left( 1 + (1-q)x \right)_+^{\frac{1}{1-q}} \text{ for } q \ne 1, \text{ and } \exp_{q=1}(x) := \exp(x). \tag{7} $$
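As a hedged illustration of definitions (5)-(7), the q-exponential and the unnormalized generalized q-Gaussian can be sketched as follows. The function names `exp_q` and `q_gaussian_unnormalized` are hypothetical helpers introduced here, and the density is written as a function of the norm value $r = \|x\|$ only, so that any norm can be plugged in; the partition function $Z(\gamma)$ is deliberately left out.

```python
import math

def exp_q(x, q):
    """q-exponential of (7): (1 + (1 - q) x)_+ ** (1 / (1 - q)); exp(x) when q == 1."""
    if q == 1.0:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        # (.)_+ clips the base to 0: the power is 0 for q < 1 and diverges for q > 1.
        return 0.0 if q < 1.0 else math.inf
    return base ** (1.0 / (1.0 - q))

def q_gaussian_unnormalized(r, q, gamma, alpha):
    """Z(gamma) * G_gamma(x) from (6), as a function of r = ||x|| >= 0 (hypothetical helper)."""
    return exp_q(-gamma * r ** alpha, 2.0 - q)
```

For $q > 1$ the returned value is exactly zero beyond the edge of the support, while for $q < 1$ it decays as a power of $r$, in line with the remarks following (5).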



§ In the case of the Euclidean norm, the general expressions of the main information measures attached to the
generalized Gaussians are derived in Appendix A of [22]. Similar expressions can readily be obtained in the case of
a general norm, using the change of variable in polar coordinates $x = ru$, with $u = x/\|x\|$, and the representation
of the Lebesgue measure $dx = r^{n-1}\, dr\, d\sigma(u)$, c.f. [23, p. 87], where $d\sigma(u)$ denotes the surface element on the unit
sphere. By this remark, the expressions in [22, Appendix A] are valid, with the proviso that $\omega_n$ will denote the
volume of the n-dimensional unit ball $B = \{x \in \mathbb{R}^n : \|x\| \le 1\}$.

In section 4 we present another Cramer-Rao type inequality which is also saturated by the
generalized q-Gaussian. This inequality was originally established in [17], and extended to the
multidimensional case in [18] and independently in [22] in the case of a Euclidean norm. We show
here that this last inequality can readily be stated and proved in the case of an arbitrary norm.

In order to derive these different inequalities, we will need some preliminary results, in
particular concerning some properties of general norms on $\mathbb{R}^n$. This is the objective of section
2, where we first define the notion of dual norms and prove a result on the gradient of a general
norm. Then, we establish, together with its equality conditions, a general Holder-type inequality.
This inequality will be an essential ingredient in the derivation of the general Cramer-Rao inequality
(1).

2. Preliminary results 

As mentioned above, we will consider here general norms on $\mathbb{R}^n$. Let us first simply recall that a
norm on $\mathbb{R}^n$ is a function $\|.\|: \mathbb{R}^n \to \mathbb{R}_+$ such that for any $x, y \in \mathbb{R}^n$ and $\gamma \in \mathbb{R}$,

$$ \text{(a) } \|\gamma x\| = |\gamma|\, \|x\|, \quad \text{(b) } \|x + y\| \le \|x\| + \|y\|, \quad \text{and (c) } \|x\| = 0 \text{ iff } x = 0. \tag{8} $$

A large class of norms is the class of $L_p$-norms, $p \ge 1$, given by $\|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}$. As important
particular cases, we have the $L_1$-norm, $\|x\|_1 = \sum_{i=1}^n |x_i|$, the max-norm or $L_\infty$-norm $\|x\|_\infty =
\max(|x_1|, \ldots, |x_n|)$, and of course the Euclidean $L_2$-norm $\|x\|_2 = \left( \sum_{i=1}^n x_i^2 \right)^{1/2}$. We shall mention
that it is possible to use weighted versions of the norms above, e.g. $\|x\|_{w,p} = \left( \sum_{i=1}^n w_i |x_i|^p \right)^{1/p}$, with
$w_i > 0$, and that any injective linear transformation $A$ leads to a new norm, such as $\|x\|_A = \|Ax\|$.
Finally, it is also possible to construct new norms by combining different norms defined on subvectors
of $x$.
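The families of norms listed above can be sketched in a few lines. This is only an illustration: the helper names `lp_norm`, `weighted_lp_norm` and `transformed_norm` are hypothetical, vectors are plain Python lists, and the matrix `A` is assumed to be given as a list of rows.

```python
def lp_norm(x, p):
    """L_p norm of a vector x, with p >= 1 or p = float('inf') for the max-norm."""
    if p == float("inf"):
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def weighted_lp_norm(x, w, p):
    """Weighted L_p norm with strictly positive weights w_i."""
    return sum(wi * abs(xi) ** p for wi, xi in zip(w, x)) ** (1.0 / p)

def transformed_norm(x, A, p):
    """Norm ||x||_A = ||A x||_p induced by an injective linear map A (list of rows)."""
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return lp_norm(Ax, p)
```

Each of these functions satisfies properties (a)-(c) of (8) under the stated assumptions on `p`, `w` and `A`.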

A related important notion is the notion of dual norm. Let $E = (\mathbb{R}^n, \|.\|)$ be an n-dimensional
normed space, where $\|.\|$ is an arbitrary norm, and let us denote $E^* = (\mathbb{R}^n, \|.\|_*)$ its dual space.
For $Y \in E^*$, the dual norm $\|.\|_*$ is defined by

$$ \|Y\|_* = \sup_{\|X\| \le 1} X.Y, \tag{9} $$

where $X.Y$ is the standard scalar product $X.Y = \sum_{i=1}^n X_i Y_i$. In particular, it is well known that if
$\|.\|$ is an $L_p$-norm, then $\|.\|_*$ is the $L_q$-norm, where $p$ and $q$ are Holder conjugates of each other, i.e.
$p^{-1} + q^{-1} = 1$, see e.g. [24, chapter 5]. As a direct consequence of the definition of the dual norm,
we always have

$$ X.Y \le \|X\|\, \|Y\|_*. \tag{10} $$

Note that when the dot product $X.Y$ is negative, we can always take the opposite of one of the
elements to get $|X.Y| = X.(-Y) \le \|X\|\, \|-Y\|_* = \|X\|\, \|Y\|_*$. Hence, we see that we actually have
an extension of Holder's inequality for vectors:

$$ |X.Y| \le \|X\|\, \|Y\|_*. \tag{11} $$

Obviously, we recover here the Cauchy-Schwarz inequality if $\|.\| = \|.\|_2$ and the standard Holder
inequality for vectors if $\|.\| = \|.\|_p$, and thus $\|.\|_* = \|.\|_q$.
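The duality between the $L_p$ and $L_q$ norms and the inequality (11) can be checked numerically. The sketch below is restricted to $1 < p < \infty$; the helper `dual_maximizer` is a hypothetical name for the unit vector that attains the supremum in (9), which for $L_p$ norms has the closed form $X_i = \mathrm{sign}(Y_i) |Y_i|^{q-1} / \|Y\|_q^{q-1}$.

```python
import math

def lp_norm(x, p):
    """L_p norm for finite p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def dot(x, y):
    """Standard scalar product X.Y."""
    return sum(a * b for a, b in zip(x, y))

def conjugate(p):
    """Holder conjugate q of p, defined by 1/p + 1/q = 1 (requires p > 1)."""
    return p / (p - 1.0)

def dual_maximizer(y, p):
    """Vector X with ||X||_p = 1 that attains sup X.Y in (9), for 1 < p < infinity."""
    q = conjugate(p)
    scale = lp_norm(y, q) ** (q - 1.0)
    return [math.copysign(abs(yi) ** (q - 1.0), yi) / scale for yi in y]
```

For instance, with `y = [1.0, -2.0, 0.5]` and `p = 3`, `dot(dual_maximizer(y, 3.0), y)` coincides with `lp_norm(y, 1.5)` up to rounding, which is the statement that the dual of the $L_3$ norm is the $L_{3/2}$ norm.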

In the following, we will need several facts on the gradient of a norm. These facts are stated
in the next Lemma.

Lemma 1. Let $\|.\|$ be differentiable at $x \in E$, and denote $x^* = \nabla_x \|.\|(x) \in E^*$ the gradient of the
norm at $x$. The gradient of $\|x\|$ satisfies (a) $x.x^* = \|x\|$ and (b) $\|x^*\|_* = 1$. Furthermore, when the
dual norm $\|.\|_*$ is strictly convex, then the gradient $x^*$ is the unique vector that satisfies (a) and
(b).

Proof. We begin with equality (a). Let $x$ and $v$ be two vectors of $E$ and $\lambda$ a positive real parameter. By the
triangle inequality, we have $\|x + \lambda v\| \le \|x\| + \lambda \|v\|$, so that

$$ \lim_{\lambda \to 0} \frac{\|x + \lambda v\| - \|x\|}{\lambda} \le \|v\|. \tag{12} $$

On the other hand, the chain rule for derivation $\frac{d\|Z\|}{d\lambda} = \frac{dZ}{d\lambda} . \nabla_Z \|Z\|$, with $Z = x + \lambda v$, gives

$$ \left. \frac{d\|x + \lambda v\|}{d\lambda} \right|_{\lambda = 0} = v.\nabla_x \|x\| \le \|v\|, \tag{13} $$

where the right inequality follows from (12). Of course, taking $v = x$ in (12) gives the equality sign,
and (13) becomes $x.\nabla_x \|x\| = \|x\|$, that is (a).
By (13), we also get that

$$ \frac{v.\nabla_x \|x\|}{\|v\|} \le 1, \tag{14} $$

with equality if $v = x$. Therefore, in the definition of the dual norm $\|\nabla_x \|x\|\|_* = \sup_{\|w\| \le 1} w.\nabla_x \|x\|$,
the supremum is equal to one and is attained for $w = x/\|x\|$. This proves (b).

In functional analysis, the existence of a solution to a system analogous to (a),(b) is granted
by a consequence of the Hahn-Banach theorem, see e.g. the background material in [25]. In this
context, the uniqueness of extension in the Hahn-Banach theorem, therefore the uniqueness of $x^*$,
is guaranteed if the primal space $E$ is smooth, which in turn is equivalent to the strict convexity
of the dual norm [26, Chapter 2]. In our setting, it is easy to check that we have uniqueness of
the solution to the system (a),(b) provided that the dual norm is strictly convex. Indeed, if $x_1^*$
and $x_2^*$ are two solutions to (a),(b), we have by (a) $\frac{x}{\|x\|}.(x_1^* + x_2^*) = 2$. Accordingly, the dual norm
$\|x_1^* + x_2^*\|_* = \sup_{\|w\| \le 1} w.(x_1^* + x_2^*)$ is necessarily greater than or equal to 2: $\|x_1^* + x_2^*\|_* \ge 2$. On the other
hand, if the dual norm is strictly convex and using (b), we have $\|x_1^* + x_2^*\|_* \le \|x_1^*\|_* + \|x_2^*\|_* = 2$,
with equality if and only if $x_1^* = x_2^*$. Combining the two inequalities, we see that the two solutions
are necessarily equal. Finally, since we have already identified that the gradient of the norm satisfies
(a),(b), we get the last item in the Lemma. □
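Properties (a) and (b) of Lemma 1 are easy to verify numerically for an $L_p$ norm with $1 < p < \infty$, whose gradient has the closed form $\mathrm{sign}(x_i)|x_i|^{p-1}/\|x\|_p^{p-1}$. The following sketch uses hypothetical helper names and also compares the closed form with a central finite-difference gradient, which works for any norm passed as a function.

```python
import math

def lp_norm(x, p):
    """L_p norm for finite p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def lp_norm_gradient(x, p):
    """Analytic gradient of ||x||_p for 1 < p < infinity and x != 0."""
    scale = lp_norm(x, p) ** (p - 1.0)
    return [math.copysign(abs(xi) ** (p - 1.0), xi) / scale for xi in x]

def numerical_gradient(norm, x, h=1e-6):
    """Central finite-difference gradient of an arbitrary norm function at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((norm(up) - norm(down)) / (2.0 * h))
    return grad

def dot(x, y):
    """Standard scalar product."""
    return sum(a * b for a, b in zip(x, y))
```

With `x = [1.0, -2.0, 3.0]` and `p = 3`, the gradient `g = lp_norm_gradient(x, 3.0)` satisfies `dot(x, g) == lp_norm(x, 3.0)` and `lp_norm(g, 1.5) == 1` up to rounding, the dual of the $L_3$ norm being the $L_{3/2}$ norm.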

The standard Holder's inequality works for functions and relates the $L_1$ norm of the product
of two functions to the product of their $L_p$ and $L_q$ norms: $\|fg\|_1 \le \|f\|_p \|g\|_q$, with $1 \le p, q \le \infty$
and $1/p + 1/q = 1$. For vectors and an arbitrary norm $\|.\|$, the inequality (11) gives another
kind of Holder inequality (actually, this inequality is also true in a broader context, see e.g. [27]).
By combining these two inequalities, we obtain another Holder-type inequality for vector-valued
functions, which involves arbitrary norms. This inequality will be key in the derivation of the new
multidimensional Cramer-Rao inequality. It is given, with its equality condition, in the following
Lemma.

Lemma 2. Let $E = (\mathbb{R}^n, \|.\|)$ be an n-dimensional normed space and $E^* = (\mathbb{R}^n, \|.\|_*)$ its dual space.
If $X(t)$ and $Y(t)$ are two functions taking values respectively in $E$ and $E^*$, and if $w(t)$ is a weight
function, then

$$ \left( \int \|X(t)\|^{\alpha} w(t)\, dt \right)^{1/\alpha} \left( \int \|Y(t)\|_*^{\beta} w(t)\, dt \right)^{1/\beta} \ge \int |X(t).Y(t)|\, w(t)\, dt \tag{15} $$

$$ \ge \left| \int X(t).Y(t)\, w(t)\, dt \right| \tag{16} $$

with $\alpha$ and $\beta$ Holder conjugates of each other, i.e. $\alpha^{-1} + \beta^{-1} = 1$, $\alpha \ge 1$. The equality is obtained
if

$$ Y(t) = K \|X(t)\|^{\alpha - 1} \nabla_{X(t)} \|X(t)\|, \tag{17} $$

with $K \in \mathbb{R}$ for inequality (15), and with $K \in \mathbb{R}_+$ for the lower bound (16). If the dual norm
is strictly convex, then the function $Y(t)$ in (17) above is the unique function which saturates the
inequalities (15)-(16).

Proof. By inequality (11), we have $|X(t).Y(t)| \le \|X(t)\|\, \|Y(t)\|_*$. Integrating this inequality with
respect to $t$, we obtain

$$ \int |X(t).Y(t)|\, w(t)\, dt \le \int \|X(t)\|\, \|Y(t)\|_*\, w(t)\, dt. \tag{18} $$

Obviously, we always have

$$ \left| \int X(t).Y(t)\, w(t)\, dt \right| \le \int |X(t).Y(t)|\, w(t)\, dt, \tag{19} $$

with equality if $X(t).Y(t) \ge 0$ everywhere. Then, it only remains to apply the standard Holder
inequality to the right hand side of (18):

$$ \int \|X(t)\|\, \|Y(t)\|_*\, w(t)\, dt \le \left( \int \|X(t)\|^{\alpha} w(t)\, dt \right)^{1/\alpha} \left( \int \|Y(t)\|_*^{\beta} w(t)\, dt \right)^{1/\beta} \tag{20} $$

to obtain (15). The inequality (16) then follows by (19).

As far as the cases of equality are concerned, we know that in the Holder inequality (20), the
equality is obtained if and only if $\|Y(t)\|_*^{\beta} = K \|X(t)\|^{\alpha}$, with $K$ a positive constant. Using the fact
that $\alpha/\beta = \alpha - 1$, the condition becomes $\|Y(t)\|_* = K \|X(t)\|^{\alpha - 1}$, with a redefined constant $K$. This condition implies that $Y(t)$
must be of the form

$$ Y(t) = K \|X(t)\|^{\alpha - 1} u, \tag{21} $$

where $u$ is a vector of $E^*$ with unit norm: $\|u\|_* = 1$. By inequality (11) we see that the integrand
on the left side of (18) is always less than or equal to the integrand on the right. Therefore, we will only get
equality in (18) if the integrands are equal. Then if we plug (21) into the inequality (11), we obtain
the condition $\|X(t)\|^{\alpha - 1} |X(t).u| = \|X(t)\|^{\alpha}$, that is finally $|X(t).u| = \|X(t)\|$. Since we know by
Lemma 1 that $v = \nabla_{X(t)} \|X(t)\|$ is a unit vector that satisfies $X(t).v = \|X(t)\|$, and is unique if the
dual norm is strictly convex, we see that $u = \pm v = \pm \nabla_{X(t)} \|X(t)\|$ and this concludes the proof of
the first inequality. For equality to hold in the lower bound (16), the integrand must be positive,
which in turn implies that $X(t).u = \|X(t)\|$ and $u = \nabla_{X(t)} \|X(t)\|$. □
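A discrete analogue of Lemma 2 can be checked numerically, with the integrals replaced by weighted sums over sample points. This sketch makes several assumptions that are not in the text: the norm on $E$ is an $L_p$ norm with $1 < p < \infty$ (so that its gradient and its dual norm are available in closed form), and the functions `holder_terms` and `saturating_y` as well as the sample data `ts`, `X`, `Y`, `w` are purely illustrative.

```python
import math

def lp_norm(x, p):
    """L_p norm for finite p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def lp_norm_gradient(x, p):
    """Analytic gradient of ||x||_p for 1 < p < infinity and x != 0."""
    scale = lp_norm(x, p) ** (p - 1.0)
    return [math.copysign(abs(xi) ** (p - 1.0), xi) / scale for xi in x]

def dot(x, y):
    """Standard scalar product."""
    return sum(a * b for a, b in zip(x, y))

def holder_terms(X, Y, w, p, alpha):
    """Discrete analogue of (15)-(16): returns (left side, middle term, right side)."""
    beta = alpha / (alpha - 1.0)      # Holder conjugate of alpha
    p_dual = p / (p - 1.0)            # the dual of the L_p norm is the L_{p_dual} norm
    left = (sum(wi * lp_norm(x, p) ** alpha for x, wi in zip(X, w)) ** (1.0 / alpha)
            * sum(wi * lp_norm(y, p_dual) ** beta for y, wi in zip(Y, w)) ** (1.0 / beta))
    middle = sum(wi * abs(dot(x, y)) for x, y, wi in zip(X, Y, w))
    right = abs(sum(wi * dot(x, y) for x, y, wi in zip(X, Y, w)))
    return left, middle, right

def saturating_y(X, p, alpha, K=1.0):
    """Samples of the function Y of (17), which should turn (15)-(16) into equalities."""
    return [[K * lp_norm(x, p) ** (alpha - 1.0) * gi for gi in lp_norm_gradient(x, p)]
            for x in X]

# Illustrative sample data: X(t) never vanishes, w(t) is a positive weight.
ts = [0.1 * k for k in range(50)]
X = [[2.0 + math.cos(t), math.sin(t)] for t in ts]
Y = [[math.sin(3.0 * t), math.cos(t) - 0.5] for t in ts]
w = [0.1 * math.exp(-t) for t in ts]
```

Calling `holder_terms(X, Y, w, 3.0, 3.0)` returns three numbers in decreasing order, while `holder_terms(X, saturating_y(X, 3.0, 3.0), w, 3.0, 3.0)` returns three numbers that agree up to rounding.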
3. The generalized Cramer-Rao inequality

In this section, we first derive a main Cramer-Rao inequality for the estimation of a multidimensional
parameter and introduce a generalized version of Fisher information. Next, we examine the
particular case of a pair of escort distributions, and then the case of a location parameter. So
doing, we obtain multidimensional versions of the q-Cramer-Rao inequality and a Cramer-Rao
inequality characterizing generalized q-Gaussian distributions.



3.1. The main Cramer-Rao inequality for the estimation of a parameter 

The problem of estimation is to determine a function $\hat\theta(x)$ in order to estimate an unknown
parameter $\theta$. Let $f(x;\theta)$ and $g(x;\theta)$ be two probability density functions, with $x \in X \subseteq \mathbb{R}^k$
and $\theta$ a parameter of these densities, $\theta \in \mathbb{R}^n$. An underlying idea in the statement of the new
Cramer-Rao inequality is that it is possible to evaluate the moments of the error with respect to
different probability distributions. For instance, in the estimation setting the estimation error is
$\hat\theta(x) - \theta$. The bias can be evaluated with respect to $f$ according to

$$ B_f(\theta) = \int_X \left( \hat\theta(x) - \theta \right) f(x;\theta)\, dx = E_f\left[ \hat\theta(x) - \theta \right], \tag{22} $$

while a general moment of a norm of the error can be computed with respect to another distribution,
$g(x;\theta)$, as in

$$ E_g\left[ \|\hat\theta(x) - \theta\|^{\alpha} \right] = \int_X \|\hat\theta(x) - \theta\|^{\alpha} g(x;\theta)\, dx. \tag{23} $$

The distributions $f(x;\theta)$ and $g(x;\theta)$ can be chosen arbitrarily and are not necessarily directly
related. However, $g(x;\theta)$ can be designed as a transformation of $f(x;\theta)$ that highlights, or perhaps
scores out, some characteristics of $f(x;\theta)$. Typically, $g(x;\theta)$ can be a weighted version of $f(x;\theta)$,
i.e. $g(x;\theta) = h(x;\theta) f(x;\theta)$. The distribution $g(x;\theta)$ can also be a quantized version of $f(x;\theta)$, such
as $g(x;\theta) = [f(x;\theta)]$, where $[.]$ denotes the integer part. Another important special case is when
$g(x;\theta)$ is defined as the escort distribution of order $q$ of $f(x;\theta)$, where $q$ plays the role of a tuning
parameter. We will see that this special case, which is particularly important in the context of
nonextensive statistical physics, will lead to generalized q-Gaussians as the extremal functions. We
are now in position to state and prove an extended version of the Cramer-Rao inequality.

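Theorem 1. Let $f(x;\theta)$ be a multivariate probability density function defined over a subset $X \subseteq \mathbb{R}^k$,
and $\theta \in \Theta \subseteq \mathbb{R}^n$ a parameter of the density. The set $\Theta$ is equipped with a norm $\|.\|$, and the
corresponding dual norm is denoted $\|.\|_*$. Let $g(x;\theta)$ denote another probability density function
also defined on $(X;\Theta)$. Assume that $f(x;\theta)$ is a jointly measurable function of $x$ and $\theta$, is integrable
with respect to $x$, is absolutely continuous with respect to $\theta$, and that the derivatives with respect to
each component of $\theta$ are locally integrable. For any estimator $\hat\theta(x)$ of $\theta$, we have

$$ E_g\left[ \|\hat\theta(x) - \theta\|^{\alpha} \right]^{1/\alpha} I_{\beta}[f|g;\theta]^{1/\beta} \ge \left| n + \nabla_\theta . B_f(\theta) \right| \tag{24} $$

with $\alpha$ and $\beta$ Holder conjugates of each other, i.e. $\alpha^{-1} + \beta^{-1} = 1$, $\alpha \ge 1$, and where the $(\beta, g)$-Fisher
information

$$ I_{\beta}[f|g;\theta] = \int_X \left\| \frac{\nabla_\theta f(x;\theta)}{g(x;\theta)} \right\|_*^{\beta} g(x;\theta)\, dx \tag{25} $$

is the generalized Fisher information of order $\beta$ on the parameter $\theta$ contained in the distribution $f$
and taken with respect to $g$. The equality case is obtained if

$$ \frac{\nabla_\theta f(x;\theta)}{g(x;\theta)} = K \|\hat\theta(x) - \theta\|^{\alpha - 1} \nabla_{\hat\theta(x) - \theta} \|\hat\theta(x) - \theta\|, \tag{26} $$

with $K > 0$.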



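Proof. The bias in (22) is an n-dimensional vector. Let us consider its divergence with respect to
variations of $\theta$:

$$ \mathrm{div}\, B_f(\theta) = \nabla_\theta . B_f(\theta). \tag{27} $$

The regularity conditions in the statement of the theorem enable us to interchange integration with
respect to $x$ and differentiation with respect to $\theta$, and

$$ \nabla_\theta . \int_X \left( \hat\theta(x) - \theta \right) f(x;\theta)\, dx = - \int_X \left( \nabla_\theta . \theta \right) f(x;\theta)\, dx + \int_X \nabla_\theta f(x;\theta) . \left( \hat\theta(x) - \theta \right) dx. \tag{28} $$

In the first term on the right, we have $\nabla_\theta . \theta = n$, and the integral reduces to $-n \int_X f(x;\theta)\, dx = -n$,
since $f(x;\theta)$ is a probability density on $X$. The second term can be rearranged so as to obtain an
integration with respect to the density $g(x;\theta)$, assuming that the derivatives with respect to each
component of $\theta$ are absolutely continuous with respect to $g(x;\theta)$, i.e. $g(x;\theta) \gg \nabla_\theta f(x;\theta)$. This
gives

$$ n + \nabla_\theta . B_f(\theta) = \int_X \frac{\nabla_\theta f(x;\theta)}{g(x;\theta)} . \left( \hat\theta(x) - \theta \right) g(x;\theta)\, dx. \tag{29} $$

Now, it only remains to apply the generalized Holder-type inequality (16) in Lemma 2 to the integral
on the right side, with $X(x) = \hat\theta(x) - \theta$, $Y(x) = \nabla_\theta f(x;\theta)/g(x;\theta)$, and $w(x) = g(x;\theta)$. This yields in all
generality

$$ \left( \int_X \|\hat\theta(x) - \theta\|^{\alpha} g(x;\theta)\, dx \right)^{1/\alpha} \left( \int_X \left\| \frac{\nabla_\theta f(x;\theta)}{g(x;\theta)} \right\|_*^{\beta} g(x;\theta)\, dx \right)^{1/\beta} \ge \left| n + \nabla_\theta . B_f(\theta) \right|, \tag{30} $$

which is (24). By Lemma 2 again, we know that the case of equality occurs if $Y(t) =
K \|X(t)\|^{\alpha - 1} \nabla_{X(t)} \|X(t)\|$, $K > 0$, which gives (26). □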

3.2. Main consequences of the general result 

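3.2.1. Case of a q-escort distribution Let $f(x;\theta)$ and $g(x;\theta)$ be a pair of q-escort distributions
linked by

$$ f(x;\theta) = \frac{g(x;\theta)^{q}}{M_q[g;\theta]} \quad \text{and} \quad g(x;\theta) = \frac{f(x;\theta)^{\bar q}}{M_{\bar q}[f;\theta]}, \tag{31} $$

with $q > 0$, $\bar q = 1/q$, and the information generating function $M_q[g;\theta]$ defined by

$$ M_q[g;\theta] = \int_X g(x;\theta)^{q}\, dx. \tag{32} $$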
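The escort relations (31) can be illustrated on a discrete probability vector, where the integral in (32) becomes a sum. This is only a sketch under that discretization assumption; the names `escort` and `expectation` are hypothetical helpers.

```python
def escort(p, q):
    """Escort of order q of a discrete probability vector p: p_i^q / sum_j p_j^q."""
    powered = [pi ** q for pi in p]
    total = sum(powered)  # discrete analogue of the information generating function (32)
    return [v / total for v in powered]

def expectation(values, p):
    """Standard expectation of `values` under the probability vector p."""
    return sum(v * pi for v, pi in zip(values, p))
```

Starting from `g = [0.1, 0.2, 0.3, 0.4]` and `q = 2.5`, the vector `f = escort(g, q)` is again a probability vector, and `escort(f, 1.0 / q)` gives back `g` up to rounding, which is the second relation in (31). The q-expectation with respect to `g` is then simply `expectation(values, f)`.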



As usual, we will denote by $E_q[.]$ the q-expectation, which is the expectation taken with respect to
an escort distribution of order $q$. Here we see that the expectation with respect to $f(x;\theta)$ is also the
q-expectation with respect to $g(x;\theta)$. Let us also recall that the inverse function of the deformed
q-exponential (7), the so-called q-logarithm, is defined by

$$ \ln_q(x) := \frac{x^{1-q} - 1}{1 - q}. \tag{33} $$
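The fact that the q-logarithm (33) inverts the q-exponential (7) can be checked directly. The sketch below uses hypothetical helper names, restricts `ln_q` to positive arguments, and treats $q = 1$ as the usual logarithm.

```python
import math

def exp_q(x, q):
    """q-exponential of (7): (1 + (1 - q) x)_+ ** (1 / (1 - q)); exp(x) when q == 1."""
    if q == 1.0:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        # (.)_+ clips the base to 0: the power is 0 for q < 1 and diverges for q > 1.
        return 0.0 if q < 1.0 else math.inf
    return base ** (1.0 / (1.0 - q))

def ln_q(x, q):
    """q-logarithm of (33) for x > 0; reduces to log(x) when q == 1."""
    if q == 1.0:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)
```

Since $\frac{d}{dx} \ln_q(x) = x^{-q}$, the gradient of $\ln_q f$ equals $f^{1-q} \nabla \ln f$, an identity that can also be tested with a finite difference on `ln_q`.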



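With these notations, we have the following corollary of the general Cramer-Rao inequality.

Corollary 1. For the pair of escort distributions (31), the equivalent Cramer-Rao inequalities

$$ E\left[ \|\hat\theta(x) - \theta\|^{\alpha} \right]^{1/\alpha} I_{\beta,q}[f|g;\theta]^{1/\beta} \ge \left| n + \nabla_\theta . E_q\left[ \hat\theta(x) - \theta \right] \right| \tag{34} $$

$$ E_{\bar q}\left[ \|\hat\theta(x) - \theta\|^{\alpha} \right]^{1/\alpha} I_{\beta,q}[f|g;\theta]^{1/\beta} \ge \left| n + \nabla_\theta . E\left[ \hat\theta(x) - \theta \right] \right| \tag{35} $$

hold, where the generalized $(\beta,q)$-Fisher information is given by

$$ I_{\beta,q}[f|g;\theta] = \left( \frac{q}{M_q[g;\theta]} \right)^{\beta} E\left[ g(x;\theta)^{\beta(q-1)} \left\| \nabla_\theta \ln \frac{g(x;\theta)}{M_q[g;\theta]^{1/q}} \right\|_*^{\beta} \right] \tag{36} $$

$$ = M_{\bar q}[f;\theta]^{\beta}\, E_{\bar q}\left[ f(x;\theta)^{\beta(1-\bar q)} \left\| \nabla_\theta \ln f(x;\theta) \right\|_*^{\beta} \right] \tag{37} $$

and where equality occurs if

$$ \nabla_\theta \ln_{\bar q} f(x;\theta) = K \|\hat\theta(x) - \theta\|^{\alpha - 1} \nabla_{\hat\theta(x) - \theta} \|\hat\theta(x) - \theta\|, \tag{38} $$

with $K > 0$.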



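Proof. The Cramer-Rao inequalities (34) and (35) directly follow from the general Cramer-Rao
inequality (24), the relations (31) between $f(x;\theta)$ and $g(x;\theta)$, and the notation of q-expectations.
The expressions of the generalized $(\beta, q)$-Fisher information also follow by direct calculation. Finally,
the equality condition (26) yields, after absorbing the factor $M_{\bar q}[f;\theta]$ into the constant $K$,

$$ f(x;\theta)^{1-\bar q}\, \nabla_\theta \ln f(x;\theta) = K \|\hat\theta(x) - \theta\|^{\alpha - 1} \nabla_{\hat\theta(x) - \theta} \|\hat\theta(x) - \theta\|. \tag{39} $$

Noticing that the term on the left is nothing but the gradient of the deformed q-logarithm,
$f(x;\theta)^{1-\bar q}\, \nabla_\theta \ln f(x;\theta) = \nabla_\theta \ln_{\bar q} f(x;\theta)$, we immediately obtain (38). □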



3.2.2. Case of a translation family In the particular case of a translation parameter, the
generalized Cramer-Rao inequality induces a new class of inequalities.

Let $\theta \in \mathbb{R}^n$ be a location parameter, $x \in X \subseteq \mathbb{R}^n$, and define by $f(x;\theta)$ the family of densities
$f(x;\theta) = f(x - \theta)$. In this case, we have $\nabla_\theta f(x;\theta) = -\nabla_x f(x - \theta)$, provided that $f$ is differentiable
at $x - \theta$, and the Fisher information becomes a characteristic of the information in the distribution.
If $X$ is a bounded subset, we will assume that $f(x)$ vanishes and is differentiable on the boundary
$\partial X$ (otherwise the Fisher information of the function extended to $\mathbb{R}^n$ is not defined).

Let us denote by $\mu_f$ the mean of $f(x)$. We immediately get that the mean of $f(x;\theta)$ is $(\mu_f + \theta)$,
so that an unbiased estimator of $\theta$ could be $\hat\theta(x) = x - \mu_f$. If we choose $\hat\theta(x) = x$, the estimator
will be biased, $B_f(\theta) = E_f[\hat\theta(x) - \theta] = \mu_f$, but the bias is independent of $\theta$, so that the divergence of the bias
with respect to $\theta$ is zero. In these conditions, the generalized Cramer-Rao inequality becomes

$$ \left( \int_X \|x - \theta\|^{\alpha} g(x;\theta)\, dx \right)^{1/\alpha} \left( \int_X \left\| \frac{\nabla_x f(x;\theta)}{g(x;\theta)} \right\|_*^{\beta} g(x;\theta)\, dx \right)^{1/\beta} \ge n. \tag{40} $$

Furthermore, we can also choose $\theta = 0$, and obtain, as a corollary, the following interesting functional
inequality.

Corollary 2. Let $f(x)$ and $g(x)$ be two multivariate probability density functions defined over a
subset $X$ of $\mathbb{R}^n$. Assume that $f(x)$ is a measurable differentiable function of $x$, which vanishes and
is differentiable on the boundary $\partial X$, that $\nabla_x f(x)$ is absolutely continuous with respect to $g(x)$, and
finally that the involved integrals exist and are finite. Then, the following inequality holds

$$ \left( \int_X \|x\|^{\alpha} g(x)\, dx \right)^{1/\alpha} \left( \int_X \left\| \frac{\nabla_x f(x)}{g(x)} \right\|_*^{\beta} g(x)\, dx \right)^{1/\beta} \ge n, \tag{41} $$

with equality if (and only if when the dual norm is strictly convex)

$$ \nabla_x f(x) = -K g(x) \|x\|^{\alpha - 1} \nabla_x \|x\|, \tag{42} $$

with $K > 0$.

As an elementary application, let us consider the univariate case, with $X = [0,1]$. Let us
take for $g(x)$ the uniform distribution on the interval. Finally, let us choose for $f(x)$ a beta
distribution: $f(x) = x^{a-1}(1-x)^{b-1}/B(a,b)$, with $B(a,b)$ the beta function. Firstly, we
have $\int_0^1 |x|^{\alpha} dx = 1/(\alpha+1) \le 1$. Secondly, $f'(x) = \left( (a-1)x^{a-2}(1-x)^{b-1} - (b-1)x^{a-1}(1-x)^{b-2} \right)/B(a,b)$, so that
the inequality (41) implies

$$ \left( \int_0^1 \left| (a-1)x^{a-2}(1-x)^{b-1} - (b-1)x^{a-1}(1-x)^{b-2} \right|^{\beta} dx \right)^{1/\beta} \ge B(a,b). \tag{43} $$

Taking now $\beta \to 1$ and $a > 1$, $b > 1$, and bounding the integrand with the triangle inequality, we obtain the following inequality for beta functions:

$$ (a-1)B(a-1, b) + (b-1)B(a, b-1) \ge B(a, b). \tag{44} $$
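The beta-function inequality (44) and the integral bound (43) at $\beta = 1$ can be checked numerically with the standard library. The helper names below are hypothetical; the beta function is computed from `math.gamma`, and the integral uses a plain midpoint rule, which is only a rough sketch and is best kept to $a, b \ge 2$ where the integrand is bounded.

```python
import math

def beta_fn(a, b):
    """Euler beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def lhs_44(a, b):
    """Left side of (44), for a > 1 and b > 1."""
    return (a - 1.0) * beta_fn(a - 1.0, b) + (b - 1.0) * beta_fn(a, b - 1.0)

def abs_derivative_integral(a, b, n=20000):
    """Midpoint-rule value of the integral in (43) with beta = 1 (illustrative only)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += abs((a - 1.0) * x ** (a - 2.0) * (1.0 - x) ** (b - 1.0)
                     - (b - 1.0) * x ** (a - 1.0) * (1.0 - x) ** (b - 2.0))
    return total * h
```

For $a = b = 2$, the integral equals $1/2$, the left side of (44) equals $1$, and both exceed $B(2,2) = 1/6$.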

3.2.3. Case of a location parameter within a pair of escort distributions By combining the two
aspects presented above, namely the case of a pair of escort distributions and the case of a location
parameter, we will obtain a new Cramer-Rao inequality saturated by multivariate generalized
q-Gaussians. This provides a new information theoretic characterization of generalized q-Gaussians
and extends our previous results to the multivariate case and arbitrary norms. As in Corollary 1, we
use a pair of q-escort distributions:

$$ f(x) = \frac{g(x)^{q}}{M_q[g]} \quad \text{and} \quad g(x) = \frac{f(x)^{\bar q}}{M_{\bar q}[f]}, \tag{45} $$

with $\bar q = 1/q$, and we denote by $E[.]$ the standard expectation with respect to $g(x)$, and by $E_{\bar q}[.]$
the $\bar q$-expectation with respect to $f(x)$, which is simply the standard expectation taken with respect
to the escort $f(x)^{\bar q}/M_{\bar q}[f]$. In the statement of the following corollary, we will use the deformed
exponential and logarithm defined in (7), (33). We will also use the notation $q_* = 2 - q$ that changes
the quantities $(1 - q_*)$ into $(q - 1)$.

Corollary 3. Let $g(x)$ be a multivariate probability density function defined over a subset $X \subseteq \mathbb{R}^n$.
Assume that $g(x)$ is a measurable differentiable function of $x$, which vanishes and is differentiable
on the boundary $\partial X$, and finally that the involved integrals exist and are finite. Then, for the pair
of escort distributions (45), the following q-Cramer-Rao inequality holds

$$ m_\alpha[g]^{1/\alpha}\, I_{\beta,q}[g]^{1/\beta} \ge n \tag{46} $$

with $m_\alpha[g] = E\left[ \|x\|^{\alpha} \right]$ and

$$ I_{\beta,q}[g] = \left( q/M_q[g] \right)^{\beta} E\left[ g(x)^{\beta(q-1)} \left\| \nabla_x \ln g(x) \right\|_*^{\beta} \right] = \left( q/M_q[g] \right)^{\beta} E\left[ \left\| \nabla_x \ln_{q_*} g(x) \right\|_*^{\beta} \right], \tag{47} $$

where $\alpha$ and $\beta$ are Holder conjugates of each other, i.e. $\alpha^{-1} + \beta^{-1} = 1$, $\alpha \ge 1$, and where $I_{\beta,q}[g]$
denotes the generalized $(\beta,q)$-Fisher information.

In terms of $\bar q$-expectations with respect to $f(x)$, it can also be written

$$ m_{\alpha,\bar q}[f]^{1/\alpha}\, I_{\beta,\bar q}[f]^{1/\beta} \ge n \tag{48} $$

with $m_{\alpha,\bar q}[f] = E_{\bar q}\left[ \|x\|^{\alpha} \right]$ and

$$ I_{\beta,\bar q}[f] = M_{\bar q}[f]^{\beta}\, E_{\bar q}\left[ f(x)^{\beta(1-\bar q)} \left\| \nabla_x \ln f(x) \right\|_*^{\beta} \right] = M_{\bar q}[f]^{\beta}\, E_{\bar q}\left[ \left\| \nabla_x \ln_{\bar q} f(x) \right\|_*^{\beta} \right]. \tag{49} $$

In both cases, equality occurs if

$$ f(x) \propto \exp_{\bar q}\left( -q\gamma \|x\|^{\alpha} \right), \text{ or equivalently } g(x) \propto \exp_{q_*}\left( -\gamma \|x\|^{\alpha} \right), \text{ with } \gamma > 0. \tag{50} $$

If the dual norm is strictly convex, then this generalized q-Gaussian is the unique probability density
function that achieves the equality in the extended Cramer-Rao inequalities.

Proof. As indicated above, the result is a direct consequence of Theorem 1 in the case of a pair of
escort distributions and of the estimation of a location parameter, with $\hat\theta(x) = x$ and $\theta = 0$. Using
(45), the condition for equality (42) becomes

$$ \frac{q}{M_q[g]}\, g(x)^{q-1} \nabla_x g(x) = -K \|x\|^{\alpha - 1} \nabla_x \|x\|\, g(x). \tag{51} $$

From this equation, we see that $g(x)$ will only be a function of the norm of $x$, and therefore will be
radially symmetric. Furthermore, we see that the gradient of $g(x)$ behaves as the negative of the
gradient of $\|x\|$. This means that $g(x)$, which is a function of $\|x\|$, is non increasing with $\|x\|$.
For $g(x) \ne 0$, and absorbing the positive factor $M_q[g]/q$ into $K$, the equality condition can be written

$$ g(x)^{q-2} \nabla_x g(x) = \frac{1}{q-1} \nabla_x g(x)^{q-1} = -\frac{K}{\alpha} \nabla_x \|x\|^{\alpha}, \tag{52} $$

which, after integration of the two sides, gives

$$ g(x)^{q-1} = -\frac{K}{\alpha}(q-1) \|x\|^{\alpha} + C, \tag{53} $$

where $C$ is a constant of integration. Since $g(x)$ is a probability density function, the solution is
restricted to the domain where the right hand side is non negative, and $g(x) = 0$ elsewhere. In
particular, when $C$ is negative, we see that $g(x)$ vanishes around the origin and presents a singularity
at $\|x\|^{\alpha} = C\alpha/(K(q-1))$. Since we assumed $g(x)$ differentiable everywhere, this solution must be
discarded.

Therefore, the constant of integration $C$ must be positive, and

$$ g(x) \propto \left( 1 - \frac{K}{\alpha C}(q-1) \|x\|^{\alpha} \right)_+^{\frac{1}{q-1}} \propto \exp_{q_*}\left( -\gamma \|x\|^{\alpha} \right), \tag{54} $$

with $\gamma = K/(\alpha C)$, which is (50). The expression of $f(x)$ simply follows from the fact that $f(x)$ is the escort distribution
of $g(x)$. □
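The saturation of (46) by a generalized q-Gaussian can be checked numerically in the simplest setting. The sketch below is an illustration under explicit assumptions that are not in the text: dimension $n = 1$ with the absolute value as norm, densities given as callables together with their derivatives, and integrals computed with a trapezoidal rule on a finite interval. The names `trapezoid` and `cramer_rao_product` are hypothetical.

```python
import math

def trapezoid(ys, xs):
    """Trapezoidal rule for samples ys taken on the grid xs."""
    return sum((ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)) / 2.0

def cramer_rao_product(g, dg, lo, hi, q, alpha, n_points=20001):
    """Left side of (46) for n = 1: m_alpha[g]^(1/alpha) * I_{beta,q}[g]^(1/beta).

    g and dg are an unnormalized density and its derivative; both are rescaled
    numerically so that g integrates to one on [lo, hi].
    """
    beta = alpha / (alpha - 1.0)
    xs = [lo + (hi - lo) * i / (n_points - 1) for i in range(n_points)]
    z = trapezoid([g(x) for x in xs], xs)
    gs = [g(x) / z for x in xs]
    dgs = [dg(x) / z for x in xs]
    m_alpha = trapezoid([abs(x) ** alpha * gv for x, gv in zip(xs, gs)], xs)
    m_q = trapezoid([gv ** q for gv in gs], xs)
    # E[g^(beta(q-1)) |g'/g|^beta] is the integral of g^(beta(q-2)+1) |g'|^beta.
    power = beta * (q - 2.0) + 1.0
    integrand = [gv ** power * abs(dv) ** beta if gv > 0.0 else 0.0
                 for gv, dv in zip(gs, dgs)]
    fisher = (q / m_q) ** beta * trapezoid(integrand, xs)
    return m_alpha ** (1.0 / alpha) * fisher ** (1.0 / beta)

# q-Gaussian with q = 2, alpha = 2, gamma = 1: g(x) proportional to (1 - x^2)_+.
q_gauss = lambda x: max(1.0 - x * x, 0.0)
q_gauss_prime = lambda x: -2.0 * x if abs(x) < 1.0 else 0.0

# Standard Gaussian shape, which is not extremal for q = 2.
gauss = lambda x: math.exp(-0.5 * x * x)
gauss_prime = lambda x: -x * math.exp(-0.5 * x * x)
```

With these definitions, `cramer_rao_product(q_gauss, q_gauss_prime, -1.0, 1.0, 2.0, 2.0)` is close to the bound $n = 1$, whereas `cramer_rao_product(gauss, gauss_prime, -10.0, 10.0, 2.0, 2.0)` is strictly larger. For $q = 1$ and $\alpha = 2$ the Gaussian attains the bound, which is the standard Cramer-Rao case.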

4. Another Cramer-Rao inequality saturated by generalized Gaussians 

We finish this paper with another Cramer-Rao type inequality, which involves a variant $\phi_{\beta,q}[g]$
of the generalized Fisher information $I_{\beta,q}[g]$ above, and which is also saturated by generalized
q-Gaussian distributions. However, this inequality is less directly related to estimation results
than the inequality (46), which is just a special case of the general Cramer-Rao inequality. The
monodimensional version of this inequality has been established by [17], and extended to the
multidimensional case in [18] and in [22] in the case of a Euclidean norm. Actually, the inequality
can readily be stated and proved for a general norm.

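Theorem 2. For $n \ge 1$, $\beta$ and $\alpha$ Holder conjugates of each other, $\alpha > 1$, $q >
\max\{(n-1)/n,\ n/(n+\alpha)\}$, and for any probability density $g$ on $\mathbb{R}^n$, supposed continuously
differentiable and such that the involved information measures are finite,

$$ m_\alpha[g]^{\frac{1}{\alpha}}\, \phi_{\beta,q}[g]^{\frac{1}{\beta\lambda}} \ge m_\alpha[G]^{\frac{1}{\alpha}}\, \phi_{\beta,q}[G]^{\frac{1}{\beta\lambda}}, \tag{55} $$

with $\lambda = n(q-1) + 1$ and where the general Fisher information is given by

$$ \phi_{\beta,q}[g] = \left( M_q[g]/q \right)^{\beta} I_{\beta,q}[g] = E\left[ g(x)^{\beta(q-1)} \left\| \nabla_x \ln g(x) \right\|_*^{\beta} \right], \tag{56} $$

and where the equality holds iff $g$ is a generalized Gaussian $g = G_\gamma$.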

For the proof of this inequality, we will use two general inequalities relating the moment
$m_\alpha[g]$, the generalized Fisher information $\phi_{\beta,q}[g]$ and the information generating function $M_q[g] =
\int g(x)^{q} dx$. We will also use the notation $N_q[g] = M_q[g]^{\frac{1}{1-q}}$, which is known as the "Renyi entropy
power".
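As a small numerical illustration, the information generating function and the Renyi entropy power can be computed on a grid for a univariate density. This sketch rests on an assumption that should be read as such: it takes $N_q[g] = M_q[g]^{1/(1-q)}$ with $q \ne 1$, a convention for which $N_q$ scales like $\sigma^n$ under a dilation of the density by $\sigma$. The helper names are hypothetical and the integral is a trapezoidal approximation on a finite interval.

```python
import math

def trapezoid(ys, xs):
    """Trapezoidal rule for samples ys taken on the grid xs."""
    return sum((ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)) / 2.0

def renyi_entropy_power(g, lo, hi, q, n_points=20001):
    """N_q[g] = M_q[g] ** (1 / (1 - q)) for a univariate density g (assumed convention, q != 1)."""
    xs = [lo + (hi - lo) * i / (n_points - 1) for i in range(n_points)]
    m_q = trapezoid([g(x) ** q for x in xs], xs)  # information generating function M_q[g]
    return m_q ** (1.0 / (1.0 - q))

def gaussian(sigma):
    """Centered Gaussian density with standard deviation sigma."""
    return lambda x: math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
```

For a Gaussian with standard deviation $\sigma$ and $q = 2$, one has $M_2 = 1/(2\sigma\sqrt{\pi})$, so that the value returned is close to $2\sigma\sqrt{\pi}$ and doubles when $\sigma$ doubles.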

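Lemma 3. For $n \ge 1$, $\alpha \in (0, \infty)$, $q > n/(n+\alpha)$, and if $g$ is a probability density on random
vectors of $\mathbb{R}^n$ with $m_\alpha[g] = E\left[ \|x\|^{\alpha} \right] < \infty$, $N_q[g] < \infty$, then

$$ \frac{m_\alpha[g]^{\frac{1}{\alpha}}}{N_q[g]^{\frac{1}{n}}} \ge \frac{m_\alpha[G]^{\frac{1}{\alpha}}}{N_q[G]^{\frac{1}{n}}}, \tag{57} $$

with equality if and only if $g$ is a generalized Gaussian.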



Proof. The inequality (57) has been stated and proved in [28] in the case of a Euclidean norm.
We simply note here that the proof in [28] works as well in the case of a general norm. □

We will also use a generalized Stam inequality derived from a general sharp Gagliardo-Nirenberg
inequality proved in the remarkable paper of Cordero et al. [10].

Lemma 4. For $n \ge 1$, $\beta$ and $\alpha$ Holder conjugates of each other, $\alpha > 1$, and $q >
\max\{(n-1)/n,\ n/(n+\alpha)\}$, then for any probability density $g$ on $\mathbb{R}^n$, supposed continuously
differentiable, the following generalized Stam inequality holds

$$ \phi_{\beta,q}[g]^{\frac{1}{\beta\lambda}}\, N_q[g]^{\frac{1}{n}} \ge \phi_{\beta,q}[G]^{\frac{1}{\beta\lambda}}\, N_q[G]^{\frac{1}{n}}, \tag{58} $$

with $\lambda = n(q-1) + 1$ and with equality if and only if $g$ is any generalized q-Gaussian (5).
Proof. In our notations, the sharp Gagliardo-Nirenberg inequality [10, Eq. (34) p. 320], with $a > 1$
and $\|\nabla u\|_\beta = \left( \int \|\nabla u\|_*^{\beta} dx \right)^{1/\beta}$, is

$$ \|u\|_{a\beta} \le K\, \|\nabla u\|_\beta^{\theta}\, \|u\|_{a(\beta-1)+1}^{1-\theta}, \tag{59} $$

where $K$ is a sharp constant which is attained if and only if $u$ is a generalized Gaussian with
exponent $1/(1-a)$, and where $\theta$ is given by $\theta = n(a-1)/\left[ a\left( n\beta - (a\beta + 1 - a)(n - \beta) \right) \right]$. The idea is to
take $u = g^{t}$, $g$ being a probability density function, with $a\beta t = 1$, and to note $q = [a(\beta-1)+1]\, t$.
With these notations, we get that $\beta t = \beta(q-1) + 1$, and (59) becomes

$$ K\, t^{\theta}\, \phi_{\beta,q}[g]^{\frac{\theta}{\beta}}\, M_q[g]^{\frac{(1-\theta)t}{q}} \ge 1. \tag{60} $$

Simplifying the expression of $\theta$ and the exponent in (60), we finally obtain, with $q < 1$, the
generalized Stam inequality (58), with equality if and only if $g$ is any generalized q-Gaussian (5).
Actually, this generalized Stam inequality is also valid in the case $q > 1$, as can be checked from
the results of [10] using similar steps as above. The conditions on $q$ simply ensure the existence of the
information measures for the generalized Gaussian. □

We end with the proof of Theorem 2, which is now an easy task.

Proof. The Cramer-Rao inequality (46) can also be written

$$ m_\alpha[g]^{\frac{1}{\alpha}}\, \phi_{\beta,q}[g]^{\frac{1}{\beta}}\, M_q[g]^{-1} \ge \frac{n}{q}. \tag{61} $$

Eliminating $M_q[g]$ between this inequality and the moment-entropy inequality (57) with $q > 1$,
we arrive at (55). Similarly, in the case $q < 1$, the elimination of $M_q[g]$ between the extended
q-Cramer-Rao inequality for a location parameter (61) and the generalized Stam inequality (58)
also leads to (55). The case of equality directly follows from the cases of equality in the initial
inequalities. Alternatively, we can observe that (55) also follows at once by the combination of (57)
and (58). □

5. Conclusions

This paper complements and improves our previous findings presented in [19]. We connect concepts
in estimation theory to tools used in nonextensive thermostatistics and establish general Cramer-Rao
type inequalities valid for estimation purposes. These results are given in the multidimensional
case, and a feature of our approach is that it works for arbitrary norms on $\mathbb{R}^n$. As a direct
consequence, we obtain multidimensional versions of our q-Cramer-Rao inequalities, which include
the Barankin-Vajda as well as the standard Cramer-Rao inequality as particular cases. Furthermore,
in the case of a translation family, we have shown that the corresponding Cramer-Rao type
inequality is saturated by multidimensional q-Gaussian distributions. We have also presented a
related general Cramer-Rao inequality which is saturated by the same q-Gaussian distributions.
These results imply in particular that the generalized q-Gaussians are the minimizers of an extended 
version of the Fisher information, among all distributions with a given moment, just as the standard 
Gaussian minimizes Fisher information over all distributions with a given variance. Since these 
generalized Gaussians are already known to be the maximum entropy distributions for Renyi or
Tsallis entropies, this yields a new, complementary, information theoretic characterization of these 
distributions. 

As is well known, the Weyl-Heisenberg uncertainty principle in statistical physics corresponds
to the standard Cramer-Rao inequality for the location parameter. Thus it would certainly be
of interest to investigate the possible meanings of the uncertainty relationships that could
be associated with the extended Cramer-Rao inequalities. Another open issue is the study of the
potential convexity properties of the generalized Fisher information. Indeed, the standard Fisher
information is a convex function of the density. If this were also true for any value of q, then it
would be possible to associate with the generalized Fisher information a statistical mechanics with the
standard Legendre structure and with the q-Gaussian as canonical distribution. Finally, in their
recent work [18], Lutwak et al. have introduced an abstract, implicit, notion of Fisher information
matrix attached to a probability density. It would be of interest to examine whether this notion
could be extended and interpreted in the estimation theory framework.